ABSTRACT

Rationale Clinical trial design in interstitial lung 
diseases (ILDs) has been hampered by lack of consensus 
on appropriate outcome measures for reliably assessing 
treatment response. In the setting of connective tissue 
diseases (CTDs), some measures of ILD disease activity 
and severity may be confounded by non-pulmonary 
comorbidities. 

Methods The Connective Tissue Disease associated
Interstitial Lung Disease (CTD-ILD) working group of
Outcome Measures in Rheumatology, a non-profit
international organisation dedicated to consensus
methodology in identification of outcome measures,
conducted a series of investigations comprising a
Delphi process involving >248 ILD medical experts and
patient focus groups, culminating in a nominal group
panel of ILD experts and patients. The goal was to
define and develop a consensus on the status of
outcome measure candidates for use in randomised
controlled trials in CTD-ILD and idiopathic pulmonary
fibrosis (IPF).

Results A core set comprising specific measures in the 
domains of lung physiology, lung imaging, survival, 
dyspnoea, cough and health-related quality of life is 
proposed as appropriate for consideration for use in a 
hypothetical 1-year multicentre clinical trial for either 
CTD-ILD or IPF. As many widely used instruments were 
found to lack full validation, an agenda for future 
research is proposed. 

Conclusion Identification of consensus preliminary 
domains and instruments to measure them was attained 
and is a major advance anticipated to facilitate 
multicentre RCTs in the field. 



Key messages

What is the key question?

► Can a core set of outcome measures that are reliable and feasible
be identified by experts for use in future clinical trials in
connective tissue disease associated interstitial lung disease
(CTD-ILD) and idiopathic pulmonary fibrosis (IPF)?

What is the bottom line?

► Using established Delphi and nominal group techniques supplemented
by patient input, a preliminary core set of outcome measures in
CTD-ILD and IPF has been identified.

Why read on?

► To learn the core set of clinically meaningful and feasible
measures in CTD-ILD and IPF that were identified and the gaps
remaining.

BACKGROUND

The diffuse idiopathic interstitial pneumonias describe a spectrum of
parenchymal lung diseases sharing clinical, physiological,
radiological and pathological similarities, including varying degrees
of fibrosis, inflammation and vascular injury. 1 Idiopathic pulmonary
fibrosis (IPF) is associated with usual interstitial pneumonia (UIP),
poor survival and limited treatment options. 2 Interstitial lung
disease (ILD), most typically presenting as non-specific interstitial
pneumonitis, is a leading cause of death in systemic sclerosis (SSc) 3
and a prominent clinical feature of other connective tissue diseases
(CTDs), including idiopathic inflammatory myopathy (IIM) and Sjogren
syndrome. UIP is also found in rheumatoid arthritis (RA) and IIM. 4 5

Current evaluations of therapies focus on patient survival or markers
of chronic disease progression, for example, change in forced vital
capacity (FVC). 6-8 Measures of patient function, for example, 6 min
walk test (6MWT), and health-related quality of life (HRQoL) have been
variably applied with inconsistent results. 6 Therapeutic research has
been hampered by lack of consensus on and validation of outcome
measures that reliably assess the likelihood of treatment response.
Furthermore, extra-pulmonary CTD manifestations may confound measures
of ILD activity/severity. Patient-reported dyspnoea is demonstrated to
predict time to death, yet a satisfactory dyspnoea instrument for ILD
has not yet been identified. 7 8 Clinically relevant, patient-reported
outcome measures (PROMs) exist for obstructive lung disease and, in
the absence of disease-specific measures, have been utilised in trials
of ILD.

The Outcome Measures in Rheumatology (OMERACT) filter 9 (see online
supplement) is a dynamic and iterative process/structure through which
an instrument's performance can be evaluated under three criteria or
points of examination: truth (face, content, construct and criterion
validity), discrimination (reliability, sensitivity to change) and
feasibility (cost, interpretability, accessibility, safety, time). The
ideal instrument satisfies all three while instruments incompletely
satisfying the filter may still be immediately useful but require
additional study.
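
As an illustration only, the three filter criteria can be held as a small record per instrument, so that criteria still unmet are easy to list. The class name, field names and the two example instruments are hypothetical and are not part of the OMERACT filter itself.

```python
from dataclasses import dataclass


@dataclass
class FilterAssessment:
    """Status of one instrument against the three OMERACT filter criteria."""
    instrument: str
    truth: bool           # face, content, construct and criterion validity
    discrimination: bool  # reliability, sensitivity to change
    feasibility: bool     # cost, interpretability, accessibility, safety, time

    def unmet(self) -> list[str]:
        """Names of the criteria this instrument does not yet satisfy."""
        criteria = {
            "truth": self.truth,
            "discrimination": self.discrimination,
            "feasibility": self.feasibility,
        }
        return [name for name, met in criteria.items() if not met]

    def needs_further_study(self) -> bool:
        """True when at least one criterion is incompletely satisfied."""
        return bool(self.unmet())


# Hypothetical instruments, for illustration only.
ideal = FilterAssessment("Instrument A", True, True, True)
partial = FilterAssessment("Instrument B", True, False, True)
print(ideal.unmet())    # []
print(partial.unmet())  # ['discrimination']
```

An instrument such as the hypothetical "Instrument B" would correspond to one that may still be useful now but requires additional study.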

The Connective Tissue Disease associated Interstitial Lung Disease
(CTD-ILD) working group of the OMERACT international consensus
initiative convened to define outcome measures for use in randomised
controlled trials (RCTs) in CTD-ILD. Given the major clinical overlap,
the same process was used in parallel for IPF. We report the results
of a three-component process: medical expert Delphi exercise, patient
perspective investigations and a combined medical expert and patient
participant nominal group technique (NGT) meeting leading to
identification of preliminary core sets of domains with corresponding
instruments that are clinically meaningful and feasible in the context
of a 1-year multi-centre RCT for each CTD-ILD and IPF. These sets of
instruments are proposed as the minimum outcome measures to be used in
future RCTs and registries.

METHODS 

Medical expert Delphi process 

Delphi 

International experts (n=270) were identified by authorship in
peer-reviewed journals, specialty society membership and peer
recommendations, and invited to participate in the web-based Delphi
process. 10-12 This began with an 'item-collection' stage called Tier
0, wherein participants nominated an unrestricted number of potential
domains (qualities to measure) and instruments (specific tools for use
as a measure) perceived as relevant for inclusion in a hypothetical
1-year RCT. This exercise produced a list of >6700 items (reduced only
for redundancy), organised into 23 domains and 616 instruments and
supplemented by expert advisory teams of pathologists and
radiologists. The results of Tier 0 provided the content for
sequential web-based surveys: Tiers 1, 2 and 3 which progressively
reduced the number of voting items as the items with the lowest
ratings were dismissed. Survey items for each CTD-ILD and IPF were
aligned in parallel and rated along a nine-point Likert scale from 1
('not at all important') to 9 ('absolutely important'), with
'insufficiently familiar' a voting alternative. An extensive online
repository of item-related journal articles was available to
participants throughout the process.

Analysis 

A cut-off of <4 (median rating) was applied to ratings from the large
number of voting items in Tier 1. Cluster analyses were applied to the
ratings in Tiers 2 and 3, avoiding the use of an arbitrary cut-off and
thus allowing items to aggregate independently, providing an unbiased
analysis of agreement among raters. 12 A nine-cluster analysis was
initially applied and reduced to three clusters for all items during
both tiers.
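
The two analysis steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis actually run: the text does not name the clustering algorithm, so a plain one-dimensional k-means stands in for it, 'insufficiently familiar' votes are assumed to be left out of the median, items with a median below 4 are assumed to be the ones dismissed, and all item names and ratings are invented.

```python
from statistics import median


def tier1_survivors(ratings_by_item, cutoff=4):
    """Keep items whose median rating is at least `cutoff`.

    Ratings are integers 1-9; None marks an 'insufficiently familiar'
    vote and is excluded from the median (an assumption).
    """
    kept = {}
    for item, ratings in ratings_by_item.items():
        scored = [r for r in ratings if r is not None]
        if scored and median(scored) >= cutoff:
            kept[item] = median(scored)
    return kept


def three_clusters(medians, rounds=20):
    """Group items by median rating with a simple 1-D k-means (k=3).

    k-means is a stand-in here; the source does not specify the method.
    Returns three lists of item names, from lowest to highest centre.
    """
    values = sorted(medians.values())
    centres = [values[0], values[len(values) // 2], values[-1]]
    groups = [[], [], []]
    for _ in range(rounds):
        groups = [[], [], []]
        for item, value in medians.items():
            nearest = min(range(3), key=lambda k: abs(value - centres[k]))
            groups[nearest].append(item)
        # Recompute each centre; keep the old one if a group is empty.
        centres = [
            sum(medians[name] for name in group) / len(group) if group else old
            for group, old in zip(groups, centres)
        ]
    return groups


# Invented ratings for illustration only.
example_ratings = {
    "item_a": [9, 8, 9, None, 8],
    "item_b": [3, 2, 4, 3, None],
    "item_c": [5, 6, 5, 4, 6],
    "item_d": [7, 7, 8, 6, 7],
    "item_e": [2, None, None, 3, 1],
    "item_f": [9, 9, 9, 8, 9],
    "item_g": [4, 5, 4, 5, 4],
}
survivors = tier1_survivors(example_ratings)
clusters = three_clusters(survivors)
```

With these invented ratings, `item_b` and `item_e` fall below the cut-off and the remaining five items separate into low, middle and high rating groups.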

Patient perspective investigation 

Patient participation is recognised as integral to development of
outcome measures by OMERACT, the US Food and Drug Administration and
European Medicines Agency. 9 13 To investigate the patient perspective
in CTD-ILD, a set of qualitative studies was conducted: focus groups
(60-90 min) of 8-12 consented participants with CTD-ILD, selected by
convenience sampling, were asked (1) how their life had changed since
the diagnosis of their lung disease and (2) how their lung disease had
changed over time. Patient perspective data in 20 English-speaking
patients with IPF were previously available. 14 Content was extracted
from verbatim transcripts and inductive analysis was applied to
minimise investigator bias. 15 Following each focus group, CTD-ILD
participants (study patients with IPF were not available) rated on a
seven-point Likert scale the importance of the domains identified in
Tier 0 of the medical expert Delphi process.

NGT meeting 

At the 2012 OMERACT 11 conference and the 2012 American 
Thoracic Society (ATS) International Conference, data from the 
Delphi and the patient perspective investigations were reviewed 
by medical and patient experts. Following this, a face-to-face 
meeting was held to apply NGT to the overall results. 

At the NGT, evaluation of each domain was led by assigned 
teams of medical and patient participants who presented 
evidence-based reviews focusing on instrument validation in 
accordance with the OMERACT filter. 9 12 Several weeks prior 
to team assembly, interactive educational sessions with the 
patient participants examined each domain and instrument. The 
teams served as a resource for evidence-based information 
during the discussion phases. 

After each team presentation, all participants engaged in a
'round-robin' discussion allowing equal speaking time per participant
10-12 over two to three rounds examining acceptance or rejection of an
item, potential clinical endpoint assignment, and determination for
new instrument development within that domain. Each round of
discussions was followed by group voting.

All participants were requested to register a vote for each item. With
participants' full knowledge, responses from all physicians and
patients with CTD-ILD were tabulated for CTD-ILD, with only those from
pulmonologists and patients with IPF for IPF. All votes were recorded.
(The radiologist voting was tabulated as a pulmonologist.) A priori,
acceptance was agreed upon as >70% affirmative votes. 16 Voting
addressed inclusion/exclusion of items based on the OMERACT filter and
whether the patient perspective and evidence-based data warranted the
need for new instrument development for that corresponding domain.
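
The tabulation rule can be sketched as below, as a hedged illustration rather than the procedure actually used. The role labels, tuple layout and example ballot are hypothetical; the strict "greater than 70%" comparison follows the wording above.

```python
def acceptance(votes, disease, threshold=0.70):
    """Tally affirmative votes for one item in one disease.

    `votes` is a list of (role, patient_disease, affirmative) tuples;
    patient_disease is None for physicians. Labels are hypothetical.
    Returns (share of affirmative votes, accepted?).
    """
    if disease == "CTD-ILD":
        # All physicians, plus patients with CTD-ILD.
        eligible = [v for v in votes
                    if v[0] != "patient" or v[1] == "CTD-ILD"]
    else:
        # Pulmonologists (the radiologist is counted with them),
        # plus patients with IPF.
        eligible = [v for v in votes
                    if v[0] in ("pulmonologist", "radiologist")
                    or (v[0] == "patient" and v[1] == "IPF")]
    yes = sum(1 for v in eligible if v[2])
    share = yes / len(eligible)
    return share, share > threshold


# Invented ballot for illustration only.
ballot = [
    ("pulmonologist", None, True),
    ("pulmonologist", None, True),
    ("pulmonologist", None, True),
    ("pulmonologist", None, False),
    ("rheumatologist", None, True),
    ("rheumatologist", None, True),
    ("rheumatologist", None, False),
    ("patient", "CTD-ILD", True),
    ("patient", "IPF", True),
]
ctd_share, ctd_accepted = acceptance(ballot, "CTD-ILD")
ipf_share, ipf_accepted = acceptance(ballot, "IPF")
```

In this invented ballot the same item is judged by eight voters for CTD-ILD and by five for IPF, which is why the two shares can differ.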

RESULTS 

Medical expert Delphi 

A total of 254 experts (137 pulmonologists, 113 rheumatologists and 4
cardiologists) engaged in the Delphi process. Seventy-four per cent
reported ILD as their primary field of interest. Participation through
all stages exceeded 97%. Six domains identified were: Dyspnoea, HRQoL,
Lung Physiology/Function, Lung Imaging and Survival, and Medications
for each CTD-ILD and IPF. Eighteen instruments were identified for
each CTD-ILD and IPF (tables 1-4).

Focus groups 

Focus groups were conducted with patients (n=45) with IIM-ILD (n=11),
RA-ILD (n=13), SSc-ILD (n=17) and other CTD diagnoses (n=4) (table 5).
Patient participants attributed importance to cough, dyspnoea,
fatigue, participation (in family, social and leisure activities, work
within and outside the home), physical function, self-care and sleep
in the questionnaire and the focus groups. Changes in cough were
perceived as reflecting potential worsening ILD. Dyspnoea largely
carried descriptors different from current instruments. Patients with
IPF identified cough, dyspnoea and HRQoL effects as central symptoms.
14

OMERACT 11/ATS 2012/Domain Team meetings

Discussions and voting at the OMERACT 11/ATS 2012/Domain 
Team meetings resulted in the following changes based on the 
patient perspective data or strong evidence in recent literature 
(detailed in online supplement): 

► Cough was reintroduced, discussed and voted upon at the NGT.

► To satisfy the reintroduction of Cough, Leicester Cough
Questionnaire (LCQ) was introduced as an interim instrument to assess
Cough.

► The Mahler Dyspnea Index (MDI) and University of California San
Diego Shortness of Breath Questionnaire (UCSD-SBQ) were reintroduced
under Dyspnoea for use in CTD-ILD and IPF, respectively, based on
substantive findings in an updated literature review.

► For feasibility, HRQoL would capture 'fatigue', 'participation',
'physical function', 'self-care' and 'sleep' until disease-specific
investigations into these components were conducted.

► NGT voting would include whether development of new instruments for
Dyspnoea, Cough and HRQoL is needed.

► Owing to variability of therapies, concern regarding Medications as
a core domain was expressed. However, being identified as important in
the Delphi, a statement of clarification would be constructed at the
NGT.

► 'All-Cause Mortality' was introduced as an assessment of 'Survival'.



Table 2 Domain results of Tier 0

Tier 0 results of 23 domains

Survival
Mental health
Biomarkers
Sleep
Imaging
Global assessment
Lung physiology/function
HRQoL
Lung parenchyma
Physical function
Lung vascular
Participation
Cardiac function
Employment/work productivity
Composite scores
Medication
Gastroesophageal reflux
Extra-pulmonary CTD features
Cough
Comorbidities
Dyspnoea
Barriers to care
Fatigue

CTD, connective tissue disease; HRQoL, health-related quality of life.



NGT results 

The final NGT panel included 10 pulmonary experts, 12 
rheumatology experts and 1 radiology expert, with 5 patient 
partners (tables 6-8, and see online supplement). 

Table 6 displays the voting results on instruments for 
CTD-ILD and IPF with striking concurrence in all domains 
except for HRQoL, for which Patient Global Assessment (PtGA) 
was not accepted by the pulmonary experts for IPF. 

Tables 7 and 8 present the content of the NGT discussions in the
context of the OMERACT filter with items of special interest
highlighted below.

It was agreed that 'Medications' (ie, the incremental
increase/decrease of glucocorticoid and/or immunosuppressive therapy)
should be viewed as protocol specific rather than a core domain.
Depending on study design, 'Medications' may be either a dichotomous
interpretation of treatment efficacy/failure or a reflection of
changes in disease activity.

The lack of validated biomarkers was fully discussed. No items for
bio-specimen evaluation emerged from the Delphi exercise, but the
importance of future biomarker research was recognised and planned for
during the meeting. Consensus is required to define the minimal
standards for investigation-related bio-banking and systematic access
to samples by investigators. 17



Table 1 Reduction of domains and instruments in the Delphi process

Phase | Analysis method | Domains yielded (CTD-ILD/IPF) | Instruments yielded (CTD-ILD/IPF) | Participant dropout (%)
Tier 0 | Intense review | 133 nominations → 23 | >6700 nominations → 616/616 | 0
Tier 1 | <4 median cut-off | 21 | 71/71 | 2
Tier 2 | Cluster analysis | 13 | 58/61 | <1
Tier 3 | Cluster analysis | 5/5 | 18/18 | 0

CTD-ILD, connective tissue disease associated interstitial lung disease; IPF, idiopathic
pulmonary fibrosis.



Table 3 Results of the Delphi Tier 3 cluster analysis of domains
with median/mean reported

Five domains identified for each CTD-ILD and IPF

Domain name | CTD-ILD (median/mean) ratings on a 9-point scale | IPF (median/mean) ratings on a 9-point scale
Dyspnoea | (8.0/7.8) | (8.0/8.1)
Health-related quality of life | (8.0/7.7) | (8.0/7.8)
Lung imaging | (9.0/8.3) | (9.0/8.3)
Lung physiology/function | (9.0/8.7) | (9.0/8.7)
Survival | (8.0/8.2) | (9.0/8.4)
Medications | (8.0/7.2) | (7.0/7.3)

CTD-ILD, connective tissue disease associated interstitial lung disease; IPF, idiopathic
pulmonary fibrosis.

Table 4 Results from Tier 3 of Delphi

Domain | Instrument | Acceptance in CTD-ILD | Acceptance in IPF
Dyspnoea | Borg Dyspnea Index | CTD-ILD | IPF
Dyspnoea | MRC Breathlessness (Chronic Dyspnea) Scale or the Modified MRC Dyspnea Scale | CTD-ILD | IPF
Dyspnoea | Borg Dyspnea Index pre and post exercise | CTD-ILD | -
HRQoL | Medical Outcomes Trust Short Form 36 health survey | CTD-ILD | IPF
HRQoL | St George's Dyspnoea Respiratory Questionnaire | - | IPF
HRQoL | Visual analogue scale of Patient Assessment of Disease Activity | CTD-ILD | IPF
HRQoL | Ability to carry out activities of daily living | CTD-ILD | -
HRQoL | Health Assessment Questionnaire Disability Index | CTD-ILD | -
Lung imaging | Extent of honeycombing on HRCT | CTD-ILD | IPF
Lung imaging | Extent of reticulation on HRCT | - | IPF
Lung imaging | Extent of ground glass opacities on HRCT | CTD-ILD | -
Lung imaging | Overall extent of ILD on HRCT | CTD-ILD | IPF
Lung physiology/function | Supplemental oxygen requirement | CTD-ILD | IPF
Lung physiology/function | FVC on spirometry | CTD-ILD | IPF
Lung physiology/function | Diffusion capacity of lung for carbon monoxide | CTD-ILD | IPF
Lung physiology/function | 6MWT with maximal desaturation on pulse oximetry | CTD-ILD | IPF
Lung physiology/function | 6MWT for distance | - | IPF
Survival | Time to decline in FVC | CTD-ILD | IPF
Survival | Progression-free survival | CTD-ILD | IPF
Survival | Time to death | - | IPF
Medications | Increase or decrease in glucocorticoids | CTD-ILD | IPF
Medications | Increase or decrease in concomitant immune suppressive agents | CTD-ILD | IPF

6MWT, 6 min walk test; CTD-ILD, connective tissue disease associated interstitial lung
disease; FVC, forced vital capacity; HRCT, high-resolution CT; IPF, idiopathic
pulmonary fibrosis; HRQoL, health-related quality of life; MRC, Medical Research
Council.









DISCUSSION 

These comprehensive international investigations are the first to
identify core sets of domains in each CTD-ILD and IPF along with a
provisional consensus on a minimum cadre of feasible and clinically
meaningful outcome measures/instruments. The proposed measures are
intended to be a common denominator across future RCTs, longitudinal
observational studies and natural history registries until work can be
done that substantiates a truly durable framework. The rigorous
consensus methodologies of OMERACT outline the overall status of the
field. Importantly, this is the first study in ILD to incorporate
patient participants in panel meetings or guidelines. From the synergy
of these investigations, domains which require development of new
instruments were also identified, thus providing guidance for imminent
research.

Based on the current data, FVC (100% acceptance) was the measure that
the group favoured most for each CTD-ILD and IPF. Again, we emphasise
that the overarching construct of this exercise was limited to that of
a hypothetical RCT of 1-year duration. FVC has been shown to be a
consistently reliable serial variable in IPF. Declines in FVC
correlate with increased risk of subsequent mortality, 4 7 8 18-22
although no data exist demonstrating that improvement in FVC
correlates with improved survival. Thus, utilising FVC as an endpoint
requires consideration of the clinically meaningful magnitude of
change independent of potential impact on mortality. This is
particularly relevant in studies of short duration.

While changes in FVC have been shown to be reproducible in SSc-ILD,
there are insufficient RCT-derived data to evaluate this in other
forms of CTD-ILDs. 3-5 20 There are confounding issues of
vasculopathy, pulmonary hypertension, cardiac involvement, chest wall
impairment and systemic disease activity that are often coexistent in
CTD-ILDs. Nonetheless, FVC may most reliably and sensitively reflect
the contribution of parenchymal disease above other endpoints.

Though a relative change from baseline predicted is preferred to
absolute change from normal values, these changes are recognised as
non-parametric in FVC. Thus a discrete clinically relevant threshold
of minimal change could not be agreed upon in either IPF or CTD-ILD.
Further, efforts to validate serial variables are challenged by
variations in the rate of disease progression, with interval changes
of FVC 20 22 more likely to represent a true change in rapidly
progressive disease than in less progressive disease that crosses the
same threshold. Extrapolation between two value points will provide
less reliable information than continuous variables; therefore,
identification of a minimal clinically important difference (MCID)
would be misleading without accommodating for these non-parametric
changes. Panel discussions surrounding Diffusion Capacity of Lung for
Carbon Monoxide (DLCO) reflected the multiple confounders for this
instrument, with ranking of FVC as being the favoured marker above
DLCO. A threshold of clinically meaningful change was not determined
for DLCO.
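
The distinction between relative and absolute change in FVC can be illustrated with a short calculation. This is a sketch under an assumption: "relative change from baseline predicted" is read here as the change in FVC % predicted expressed as a percentage of the baseline value, and the FVC values are invented. No threshold is implied, since none was agreed.

```python
def fvc_changes(baseline_pct_predicted, followup_pct_predicted):
    """Return (absolute change, relative change) in FVC % predicted.

    Absolute change is in percentage points of predicted FVC.
    Relative change is that difference as a percentage of baseline.
    """
    absolute = followup_pct_predicted - baseline_pct_predicted
    relative = 100 * absolute / baseline_pct_predicted
    return absolute, relative


# Invented values: the same 6-point absolute fall is a larger relative
# fall when the baseline is already low.
low_baseline = fvc_changes(60, 54)
high_baseline = fvc_changes(90, 84)
print(low_baseline)   # (-6, -10.0)
```

For the patient starting at 90% predicted, the same 6-point fall is a relative decline of roughly 6.7%.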



Table 5 Characteristics of patients with CTD-ILD participating in the focus groups

Group | CTD type | Location | Participants | Gender | Age (years), mean (SD) | Race
1 | Various (1 IIM, 2 RA, 4 SSc, 2 SLE) | Winnipeg, Manitoba, Canada | 9 | 8 F, 1 M | 53.6 (16.2) | 8 C, 1 O
2 | RA | Toronto, Canada | 7 | 7 F, 0 M | 64.3 (9.0) | 4 C, 2 A, 1 AC
3 | SSc | Baltimore, Maryland, USA | 6 | 3 F, 3 M | 58.2 (9.1) | 6 C
4 | IIM | Baltimore, Maryland, USA | 7 | 4 F, 3 M | 52.4 (10.5) | 5 C, 2 AA
5 | Various (3 IIM, 4 RA, 1 SjS, 1 SLE) | New Orleans, Louisiana, USA | 9 | 6 F, 3 M | 53.8 (15.5) | 4 C, 4 AA, 1 H
6 | SSc | New Orleans, Louisiana, USA | 7 | 5 F, 2 M | 54.6 (5.7) | 4 AA, 3 C

A, Asian; AA, African American; AC, African Caribbean; C, Caucasian; CTD-ILD, connective tissue disease associated interstitial lung disease; F, female; H, Hispanic; IIM, idiopathic
inflammatory myopathy; M, male; O, other; RA, rheumatoid arthritis; SjS, Sjogren's syndrome; SLE, systemic lupus erythematosus.

Table 6 Results of nominal group proceedings with percentage
for acceptance (see online supplement for expanded voting tables)

Instrument | CTD-ILD: PULM+RHEUM+patients with CTD-ILD | IPF: PULM+patient with IPF
Dyspnoea
MRC Chronic Dyspnea Scale | 7/9+9/12+2/3=75% | 10/11+1/1=92%
Dyspnea 12 | 8/10+11/12+3/3=88% | 6/9+1/1=70%
UCSD-SBQ | N/A | 7/9+1/1=80%
Cough
Leicester cough questionnaire | 7/10+10/12+2/2=79% | 8/10+1/1=82%
HRQoL
Short Form 36 | 10/10+11/11+3/3=100% | 8/10+1/1=82%
SGRQ | 9/10+9/11+2/2=87% | 8/10+1/1=82%
VAS-PtGA | 10/10+11/12+2/2=96% | N/A
Lung imaging
Overall extent of ILD on HRCT | 11/11+9/11+3/3=92% | 10/10+1/1=100%
Lung physiology
Forced vital capacity | 10/10+11/11+3/3=100% | 10/10+1/1=100%
Diffusion capacity of lung | 10/10+8/10+3/3=91% | 10/10+1/1=100%
Survival
All-cause mortality | Unanimous agreement | Unanimous agreement

CTD-ILD, connective tissue disease associated interstitial lung disease; HRCT,
high-resolution CT; HRQoL, health-related quality of life; IPF, idiopathic pulmonary
fibrosis; MRC, Medical Research Council; PtGA, Patient Global Assessment; PULM,
pulmonary specialist; RHEUM, rheumatology specialist; SGRQ, St George's Respiratory
Questionnaire; UCSD-SBQ, University of California San Diego Shortness of Breath
Questionnaire; VAS, visual analogue scale.
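
The percentages in Table 6 are consistent with pooled counts, that is, the sum of affirmative votes divided by the sum of votes cast across the voting groups, rather than an average of the per-group percentages. This reading is inferred from the arithmetic, not stated in the text. A minimal sketch that reproduces the pooled figure from a vote expression (the function name is hypothetical):

```python
from fractions import Fraction


def pooled_percentage(expression):
    """Pool 'yes/total' terms such as '7/9+9/12+2/3' into one percentage.

    Sums the affirmative votes and the votes cast across groups, then
    rounds the pooled share to the nearest whole per cent.
    """
    yes = 0
    total = 0
    for term in expression.split("+"):
        affirmative, cast = term.split("/")
        yes += int(affirmative)
        total += int(cast)
    return round(Fraction(yes * 100, total))


print(pooled_percentage("7/9+9/12+2/3"))   # 75  (18 of 24 votes)
print(pooled_percentage("9/10+9/11+2/2"))  # 87  (20 of 23 votes)
```

Pooling weights each vote equally, so a small group such as the patient participants moves the result less than a simple average of group percentages would.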





Neither the 6MWT nor measures of oxygen desaturation survived the NGT
process; although deemed feasible they were considered weak in
discrimination in addition to construct and criterion validity. The
need for supplemental oxygen was not accepted; changes in oxygenation,
as judged partly by oxygen desaturation, are difficult to interpret
since they do not correlate well with the sensation of dyspnoea or
changes in disease progression in mild to moderate disease. 19 23

The importance of patient-reported dyspnoea for assessing prognosis
and disease progression is well recognised. 1 7 8 We identified the
Dyspnea 12 24 and the Medical Research Council Dyspnea Scale 18 19 as
the best currently available instruments in CTD-ILD and in IPF, yet
data are essentially lacking in CTD-ILD. Though the MDI has some
demonstrated validity in SSc-ILD, 20 NGT panelists allocated this
interviewer-administered instrument to the research agenda for
CTD-ILD, voicing concerns of poor feasibility and uncertain
reliability. The UCSD-SBQ was accepted for use in studying IPF. 21 It
was agreed that development of new Dyspnoea instruments is warranted
to specifically reflect the restrictive lung processes of CTD-ILD and
IPF.

The Short Form 36 (SF-36) was recognised as a generic 
HRQoL instrument as anxiety, fatigue, participation, physical 
function, self-care and sleep are important to patients. 25 The St 
George's Respiratory Questionnaire, although endorsed, lacked 
specificity in CTD-ILD and IPF. 26 27 It was agreed that a new 
disease-specific instrument should be developed. 

PtGA, previously validated across rheumatic and non-rheumatic
diseases, correlates with dyspnoea in CTD-ILD 28 29 and was accepted
as a measure in CTD-ILD with improvements greater than 10 mm agreed
upon as an MCID. PtGA, not being validated in IPF, was allocated to
the research agenda in IPF. PtGA may also serve as an 'anchor' to
determine MCIDs for recently developed PROMs, such as the King's Brief
ILD Health Assessment Questionnaire (K-BILD). 30

Table 7 Relation of CTD-ILD preliminary core set instruments to aspects of OMERACT filter in CTD-ILD

Domain | Dyspnoea | Dyspnoea | Cough | HRQoL | HRQoL | HRQoL | Lung physiology | Lung physiology | Lung imaging | Survival | Survival
Instrument | D-12 | MRC | LCQ | SGRQ | SF-36 | PtGA | FVC | DLCO | HRCT: overall extent of disease | All-cause mortality | Time to decline in FVC
Truth
Face validity | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y
Content validity | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y
Construct validity | Y | Y | NT | Y | Y | NT | Y | ± | Y | Y | NT
Criterion validity | NT | NT | NT | NT | NT | NT | No | No | Y | Y | NT
Discrimination
Discriminatory | Y | Y | NT | Y | Y | NT | ± | ± | Yes, except ± for GGO | No | Y
Reliable | Y | Y | NT | NT | Y | NT | Y | N | Yes, except ± for GGO | Y | NT
Reproducible | NT | NT | NT | NT | NT | NT | Y | ± | Y | N/A | NT
Sensitive to change | Y | Y | NT | NT | Y | NT | Y | ± | Yes but relatively slow | N/A | Y
Feasibility
Cost effective | Y | Y | Y | Y | Y | Y | Y | Y | Y | No* | Y
Interpretability | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y
Readily available | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y
Safe for patients | Y | Y | Y | Y | Y | Y | Y | Y | ± | Y | Y
Patient-derived content† | Y | No | No | No | No | N/A | N/A | N/A | N/A | N/A | N/A

PtGA is adopted under HRQoL, though it is an independent instrument.

*Not cost effective as a primary efficacy endpoint but highly cost effective as a secondary endpoint to detect treatment toxicity; see text for discussion on 'survival'.
†US Food and Drug Administration advocates patient-reported instruments be developed by qualitative data supplied by patients. 18 19

±, ambiguous; CTD-ILD, connective tissue disease associated interstitial lung disease; D-12, Dyspnea-12; DLCO, diffusion capacity of lung for carbon monoxide; FVC, forced vital
capacity; GGO, ground glass opacity; HRCT, high-resolution CT; LCQ, Leicester Cough Questionnaire; MRC, Medical Research Council Dyspnea Scale; N/A, not applicable; NT, not yet
tested; OMERACT, Outcome Measures in Rheumatology; PtGA, Patient Global Disease Activity; SGRQ, St George's Respiratory Questionnaire; SF-36, Short Form 36; Y, yes.

Table 8 Relation of IPF preliminary core set instruments to aspects of OMERACT filter in IPF

Domain | Dyspnoea | Dyspnoea | Dyspnoea | Cough | HRQoL | HRQoL | Lung physiology | Lung physiology | Lung imaging | Survival
Instrument | D-12 | MRC | UCSD-SBQ | LCQ | SGRQ | SF-36 | FVC | DLCO | HRCT: overall extent of disease | All-cause mortality
Truth
Face validity | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y
Content validity | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y
Construct validity | Y | Y | Y | NT | Y | Y | Y | Y | Y | Y
Criterion validity | NT | NT | NT | NT | NT | NT | No | No | Y | Y
Discrimination
Discriminatory | NT | NT | Y | NT | NT | NT | ± | ± | Y | No
Reliable | NT | NT | NT | NT | Y | Y | Y | N | Y | Y
Reproducible | NT | NT | NT | NT | Y | NT | Y | ± | Y | N/A
Sensitive to change | NT | NT | Y | NT | Y | Y | Y | Y | Yes but relatively slow | N/A
Feasibility
Cost effective | Y | Y | Y | Y | Y | Y | Y | Y | Y | No*
Interpretability | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y
Readily available | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y
Safe for patients | Y | Y | Y | Y | Y | Y | Y | Y | ± | Y
Patient-derived content† | Y | No | No | No | No | No | N/A | N/A | N/A | N/A

*Not cost effective as a primary efficacy endpoint but highly cost effective as a secondary endpoint to detect treatment toxicity; see text for discussion on 'survival'.
†US Food and Drug Administration advocates patient-reported instruments be developed by qualitative data supplied by patients. 18 19

±, ambiguous; D-12, Dyspnea-12; DLCO, diffusion capacity of lung for carbon monoxide; FVC, forced vital capacity; HRCT, high-resolution CT; IPF, idiopathic pulmonary fibrosis; LCQ,
Leicester Cough Questionnaire; MRC, Medical Research Council Dyspnea Scale; N/A, not applicable; NT, not yet tested; OMERACT, Outcome Measures in Rheumatology; SGRQ, St
George's Respiratory Questionnaire; SF-36, Short Form 36; UCSD-SBQ, University of California San Diego Shortness of Breath Questionnaire; Y, yes.

The extent of ground-glass opacities, honeycombing and/or
reticulations on high-resolution CT (HRCT) scan each merited careful
consideration as outcome measures. However, taken separately each was
felt to incompletely capture disease progression in either CTD-ILD or
IPF. The overall extent of ILD on HRCT was accepted to provisionally
describe the most appropriate and feasible composite of radiological
abnormalities to monitor for disease progression. 31 32 No specific
assessment tool could be confidently identified at this time, as it is
not yet clear whether subjective or automated objective assessment is
the more accurate approach. Though serial HRCT raises concern for
patient safety, validation studies of less radio-intense methods of
HRCT serial assessment 33 are underway.

Progression-free survival in IPF was agreed to have merit; 34 however,
the group was undecided as to the practicality of this endpoint in the
context of a trial limited to 1 year's duration. Mortality was minimal
or absent in two recent RCTs of SSc-ILD. 35 36 There are cogent
arguments for and against survival as the primary outcome in studies
of IPF. 34 37 Regardless of this unresolved debate, mortality was
recognised as an essential endpoint in all treatment trials as it
provides a harm signal, 34 37 with all-cause mortality identified as a
valid measure of survival in CTD-ILD and IPF. The utility of other
measures of progression-free survival in RCTs requires further
investigation of candidate instruments before recommending their use
in RCTs.

While the domain of Cough did not survive the Delphi 
process, it was important to patient participants. Additionally, 
there is a correlation between cough and IPF progression 38 and 
with ILD severity in SSc. 39 In SSc-ILD, cough adversely 
impacted HRQoL and improved with treatment. 39 The LCQ 
was selected as an interim measure as it was deemed more able 
to capture frequency, quality and intensity, and impact on 
HRQoL. It was also most feasible to administer. 40 41 



Primary and secondary endpoint status of the proposed measures were
considered, intensely discussed and even voted upon during the NGT.
However, at this preliminary stage and given the lack of full
validation of the core measures, the consensus was to pursue further
data. A more careful approach to endpoint status declarations entails
ad hoc and prospective performance analyses of these measures.

Though we recommend these proposed measures for all future research
ventures, continued use of measures outside this core set, for
clinical practice and research purposes, is fully expected, with
further research into their performance anticipated and necessary.
Rather, this endeavour defines the currently available, best validated
and feasible instruments while providing a much-needed prioritised
research agenda to the research community.

This project applied rigorous multi-investigational processes that
captured the perspectives of the international ILD expert community
and the life experience of patients with ILD to identify a set of
domains and measures. Participation remained robust through all tiers
of the consensus process.

The importance of patient participation is supported by the 
incorporation of HRQoL, Participation and Fatigue in the RA 
core set for RCTs. From a practical perspective, qualitative data 
collection involved only English-speaking patients from North 
America, and results may be affected by cultural, environmental 
and resource-related effects requiring further investigations to 
follow up our reported findings. Nevertheless, the engagement 
of patients as partners in the iterative process was important in 
identifying and re-capturing areas of potentially meaningful 
measures of disease activity. 

CONCLUSIONS 

It is critical that valid and clinically useful instruments be
developed and validated to assess the likelihood of treatment response
in these disorders. Identification of consensus preliminary domains
and instruments to measure them was attained and is a major advance
anticipated to facilitate multicentre RCTs in the field. However, none
of the provisional endpoints were ultimately felt to be either ideal
or fully validated. Feasible endpoints like FVC are not perfect; more
rigorous endpoints like mortality, particularly in the setting of
CTD-ILD, lack feasibility. Thus, selecting the best non-ideal
endpoints from a larger group of non-ideal endpoints still leaves us
with much work which includes further validation of existing and
development of new instruments.